DOCUMENT RESUME

ED 248 258                                                    TM 840 557

AUTHOR         Weiss, David J.
TITLE          Computer-Based Measurement of Intellectual
               Capabilities. Final Report.
INSTITUTION    Minnesota Univ., Minneapolis. Dept. of Psychology.
SPONS AGENCY   Office of Naval Research, Arlington, Va. Personnel
               and Training Research Programs Office.
PUB DATE       Dec 83
CONTRACT       N00014-76-C-0243
NOTE           30p.
PUB TYPE       Reports - Research/Technical (143)
EDRS PRICE     MF01/PC02 Plus Postage.
DESCRIPTORS    Ability; *Adaptive Testing; Adults; *Bayesian
               Statistics; *Computer Assisted Testing; *Individual
               Testing; Latent Trait Theory; Measurement Techniques;
               *Monte Carlo Methods; Psychometrics; Response Style
               (Tests); Test Construction; Testing Problems; *Test
               Theory
IDENTIFIERS    Marine Corps

ABSTRACT

During 1975-1979 this research into the potential of 
computerized adaptive testing to reduce errors in the measurement of 
human capabilities used Marine recruits for a live-testing validity 




examine 



computerized testing; (3) investigate intra-individual 
multidimensionality problems in ability testing; (4) examine
probabilistic responding and free-response methods for computerized 
adaptive testing in order to extract maximum information from each 
test item response; and (5) develop, refine and evaluate new computer 
administered ability tests for spatial, perceptive, memory, and other
abilities not now measurable using paper and pencil testing. Monte 
Carlo and Bayesian adaptive testing methods were used in these 
studies. Fifteen major findings, primarily on adaptive testing and 
test administration conditions, and implications for further research 
are given. Abstracts of the 16 research reports for studies for this 
program are given. (BS) 



********************************************************************* 

* Reproductions supplied by EDRS are the best that can be made 

* from the original document. 

********************************************************************* 



Final Report 

Computer-Based Measurement of Intellectual Capabilities 






David J. Weiss 






December 1983 



Computerized Adaptive Testing Laboratory 

Department of Psychology

University of Minnesota

Minneapolis MN 55455

Final Report of Project NR150-382, N00014-76-C-0243

Supported by the
Personnel and Training Research Programs
Psychological Sciences Division
Office of Naval Research

AND THE

Navy Personnel Research and Development Center



APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.
REPRODUCTION IN WHOLE OR IN PART IS PERMITTED FOR
ANY PURPOSE OF THE UNITED STATES GOVERNMENT.



REPORT DOCUMENTATION PAGE          READ INSTRUCTIONS BEFORE COMPLETING FORM

1. REPORT NUMBER:
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:

4. TITLE (and Subtitle):
   Final Report: Computer-Based Measurement of Intellectual Capabilities

5. TYPE OF REPORT & PERIOD COVERED:
   Final Report. September 1976-January 1983

6. PERFORMING ORG. REPORT NUMBER:

7. AUTHOR(s):
   David J. Weiss

8. CONTRACT OR GRANT NUMBER(s):
   N00014-76-C-0243

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
   Department of Psychology
   University of Minnesota
   Minneapolis MN 55455

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
    P.E.: 61553N    Proj.: RR042-04
    T.A.: RR042-04-01
    W.U.: NR150-382

11. CONTROLLING OFFICE NAME AND ADDRESS:
    Personnel and Training Research Programs
    Office of Naval Research
    Arlington VA 22217

12. REPORT DATE:
    December 1983

13. NUMBER OF PAGES:
    20

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):

15. SECURITY CLASS. (of this report):
    Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:

16. DISTRIBUTION STATEMENT (of this Report):
    Approved for public release; distribution unlimited. Reproduction in
    whole or in part is permitted for any purpose of the United States
    Government.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if
    different from Report):

18. SUPPLEMENTARY NOTES:
    This research was supported by funds from the Navy Personnel Research and
    Development Center and the Office of Naval Research, and monitored
    by the Office of Naval Research.

19. KEY WORDS (Continue on reverse side if necessary and identify by block
    number):
    Adaptive Testing
    Computerized Testing
    Tailored Testing
    Individualized Testing
    Response-Contingent Testing
    Latent Trait Theory
    Response Latency
    Psychological Reactions to Testing

20. ABSTRACT (Continue on reverse side if necessary and identify by block
    number):
    The research program's objectives are described, and the research approach
    is summarized and related to the sixteen technical reports completed under
    this contract. Fifteen major research findings are presented. The
    implications of the research findings and methods for future research in
    computerized testing and adaptive testing are described. Also included are
    abstracts of the sixteen technical reports.

DD FORM 1473    EDITION OF 1 NOV 65 IS OBSOLETE

CONTENTS

Objectives

Approach

Major Findings
  Adaptive Testing Strategies
  Test Administration Conditions
  Other Findings

Implications for Further Research
  Adaptive Testing Strategies
  Test Administration Conditions
  Intra-Individual Dimensionality, Response Modes, and New Abilities

Research Report Abstracts
  75-6. A Simulation Study of Stradaptive Ability Testing
  76-1. Some Properties of a Bayesian Adaptive Ability Testing Strategy
  76-2. Effects of Time Limits on Test-Taking Behavior
  76-3. Effects of Immediate Knowledge of Results and Adaptive Testing on
        Ability Test Performance
  76-4. Psychological Effects of Immediate Knowledge of Results and
        Adaptive Ability Testing
  77-1. Applications of Computerized Adaptive Testing
  77-2. A Comparison of Information Functions of Multiple-Choice and
        Free-Response Vocabulary Items
  77-3. Accuracy of Perceived Test-Item Difficulties
  77-4. A Rapid Item-Search Procedure for Bayesian Adaptive Testing
  78-2. The Effects of Knowledge of Results and Test Difficulty on Ability
        Test Performance and Psychological Reactions to Testing
  79-7. The Person Response Curve: Fit of Individuals to Item
        Characteristic Curve Models
  80-2. Interactive Computer Administration of a Spatial Reasoning Test
  80-3. Criterion-Related Validity of Adaptive Testing Strategies
  80-5. An Alternate-Forms Reliability and Concurrent Validity Comparison
        of Bayesian Adaptive and Conventional Ability Tests
  81-2. Effects of Immediate Feedback and Pacing of Item Presentation on
        Ability Test Performance and Psychological Reactions to Testing
  83-1. Reliability and Validity of Adaptive and Conventional Tests in a
        Military Recruit Population



FINAL REPORT:

Computer-Based Measurement of Intellectual Capabilities



Objectives 

The objectives of this research program were based on a review of previous
research literature that identified the potential of computerized adaptive
testing to reduce at least five kinds of errors in the measurement of human
capacities:

1. Errors due to mismatch of test item difficulty with testee ability;

2. Errors due to the psychological effects of testing;

3. Errors due to inappropriate dimensionality;

4. Errors due to failure to extract sufficient information from the testee;

5. Errors due to over-simplistic conceptualizations of intellectual
capabilities.


Within the context of these five sources of error, which act to reduce the
precision, accuracy and utility of current ability testing procedures, the
research was designed to:

1. Extend previous research efforts to identify the most useful computer-based
adaptive testing strategies.

2. Study the psychological effects of computerized adaptive testing, to
identify those testing conditions which minimize adverse effects and maximize
positive effects.

3. Investigate the problem of intra-individual multidimensionality in ability
testing.

4. Examine the use of such response modes as probabilistic responding and
free-response methods for use in computerized adaptive testing in order to
extract maximum information from each examinee's response to each test
item.

5. Develop, refine and evaluate new computer-administered ability tests which
measure abilities not now measurable using paper and pencil ability
testing.

Research in pursuance of these primary objectives began in September 1975
and continued through December 1978. A contract extension, funded by the Navy
Personnel Research and Development Center, was designed to complete a
live-testing validity comparison of adaptive and conventional tests using
Marine recruits. This extension continued the contract through September 1979.
Technical reports were completed through January 1983.




Approach 

The major focus of the research was on the evaluation of adaptive testing
strategies by comparison of their characteristics with each other and with
conventional tests. Both monte carlo simulation and live testing were used in
these studies. In Research Report 75-6 the stradaptive testing strategy was
examined in monte carlo simulation to evaluate various scoring techniques
possible with this testing strategy, under various test lengths and prior
information conditions. Performance of the stradaptive testing strategy was
also evaluated in live testing (Research Report 80-3) by comparing its validity
with that of a conventional test and a Bayesian adaptive test.
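
The branching idea behind a stradaptive (stratified adaptive) test can be
sketched as follows. This is a minimal illustration only, assuming that items
are grouped into difficulty strata, that the entry stratum reflects whatever
prior information is available, and that the test moves one stratum harder
after a correct answer and one stratum easier after an incorrect one. The
function name, the three-stratum pool, and the fixed test length are
hypothetical and are not taken from Research Report 75-6.

```python
def stradaptive_path(strata, is_correct, entry_stratum, max_items):
    """Return the (stratum index, item id) pairs administered to one examinee.

    strata        -- list of item-id lists, ordered easiest stratum first
    is_correct    -- callable taking an item id, True for a correct answer
    entry_stratum -- index of the stratum used for the first item
    max_items     -- fixed test length (a simplifying assumption)
    """
    next_unused = [0] * len(strata)   # next unused position in each stratum
    stratum = entry_stratum
    path = []
    # Stop at the fixed length, or when the current stratum has no items left.
    while len(path) < max_items and next_unused[stratum] < len(strata[stratum]):
        item = strata[stratum][next_unused[stratum]]
        next_unused[stratum] += 1
        path.append((stratum, item))
        if is_correct(item):
            stratum = min(stratum + 1, len(strata) - 1)   # branch harder
        else:
            stratum = max(stratum - 1, 0)                 # branch easier
    return path


# Hypothetical three-stratum pool; this examinee misses only the hard items.
pool = [["e1", "e2"], ["m1", "m2"], ["h1", "h2"]]
path = stradaptive_path(pool, lambda item: not item.startswith("h"), 1, 4)
print(path)
```

In this sketch the examinee oscillates between the middle and hard strata,
which is the behavior a stratified branching rule is meant to produce once the
difficulty level matches the examinee's ability.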

The Bayesian adaptive testing strategy was further studied in several
reports. Monte carlo simulation was used in Research Report 76-1 to examine the
performance of this testing strategy under several item pool configurations and
at a number of test lengths. In Research Reports 80-5 and 83-1, the reliability
and validity of the Bayesian adaptive test was compared with that of
conventional tests in a college population (80-5) and in a military recruit
population (83-1). Research Report 77-4 describes a procedure for improving the
efficiency of item selection in Bayesian adaptive testing.

Several other problems concerned with the application of adaptive tests to
the measurement of abilities were discussed in a symposium presented at the
1976 meeting of the Military Testing Association (Research Report 77-1). An
overview of adaptive testing strategies, presented by McBride, included a
discussion of item selection strategies, scoring adaptive tests, and problems
of evaluating adaptive tests. The problem of estimating trait status in
adaptive testing based on item response theory approaches was presented by
Sympson, including a comparison of the characteristics of Bayesian and
likelihood-based estimates. Vale, in his paper, considered the problem of
classifying individuals into discrete ability categories (e.g., pass-fail); his
monte carlo analysis compared adaptive and conventional tests designed for
making dichotomous classifications.

The effects of testing conditions on test performance were investigated in
a number of live-testing studies. Since computer-administered testing permits
immediate scoring of an examinee's answer to a test question, it becomes
possible to inform the examinee immediately after each response is given as to
whether the answer was correct or incorrect. This immediate knowledge of
results, or immediate feedback, was investigated in several studies in terms of
its effects on ability test performance in adaptive and conventional tests
(Research Reports 76-3 and 78-2), its interaction with test difficulty
(Research Report 78-2) and computer versus self-paced test administration
(Research Report 81-2), and its effects on examinees' reactions to test
administration (Research Reports 76-4 and 81-2). Related studies examined the
effects of time limits on test-taking behavior (Research Report 76-2) and the
accuracy of the perceived difficulty of test items (Research Report 77-3).

The question of intra-individual dimensionality in performance on ability
tests was recast within the more general framework of the fit of individuals to
item response theory (IRT) models. This issue was examined in one study
(Research Report 79-7) in which the predicted and actual performance of single
individuals was examined for indications of lack of person fit due to
intra-individual multidimensionality or other factors reflecting non-fit to the
unidimensional IRT models.

The use of test item response modes other than the multiple-choice item was
examined in one study (Research Report 77-2) which compared test information
derived from free-response administration to that of the same items administered
in multiple-choice mode.

The use of the unique capability of interactive computers to measure abili- 
ties not measurable by paper-and-pencil tests was examined in one study (Re- 
search Report 80-2). An interactive spatial reasoning test was designed based 
on the popular "15 puzzle" in which examinees were required to restructure a set
of 15 numerals into a target pattern using a minimum number of moves. Examinee
performance on the test was analyzed in terms of such factors as number of moves
to solution, quality of the moves, and response latencies at each point in the 
testing procedure. 

Major Findings 

The major findings below are generally organized according to the original
objectives of the research program. Additional details are in the Research
Report abstracts. Many of the original Research Reports contain additional
important findings.

Adaptive Testing Strategies 

1. Monte carlo data comparing the stradaptive test with non-adaptive
approaches to ability testing (Research Report 75-6) show that the stradaptive
test provides more equiprecise measurement than a peaked conventional test.
As item discriminations increased, the equiprecision of the stradaptive
test increased relative to that of the conventional test.

2. A stradaptive test with an average of 25% fewer items than a conventional
test obtained significantly higher validities with a college grade-point
average criterion than did the conventional test (Research Report 80-3).

3. Monte carlo evaluation of a Bayesian adaptive testing strategy identified a
number of psychometric problems in the ability estimates resulting from
this testing strategy (Research Report 76-1). Bayesian ability estimates
were highly correlated with test length, were non-linearly biased for about
two-thirds of the ability range, and were dependent on the prior ability
estimate.

4. Although the monte carlo simulations of the Bayesian adaptive test
identified these potential problems with the Bayesian ability estimates, they
appeared to have little impact on the reliability and validity of Bayesian
ability estimates. Live-testing studies of the Bayesian adaptive testing
strategy in a college population showed validities equal to those of a
conventional test (Research Report 80-3), and high reliabilities for tests of 2
to 30 items in length (Research Report 80-5); in the latter study, however,
using a concurrent validity criterion, the conventional test had higher
validity correlations than the adaptive test. In a military recruit
population (Research Report 83-1), the Bayesian adaptive test achieved both
higher validities and higher reliabilities than did a comparable
conventional test. In this population, a 9-item adaptive test achieved the same
reliability as a 17-item conventional test; 10- to 11-item adaptive tests
achieved the same concurrent validities as 28- to 30-item conventional
tests.

5. The original form of the Bayesian adaptive test used an item-search
procedure that could require excessive amounts of computing time for an
interactive test administration environment. A rapid item-search procedure was
developed and shown to select the same subset of items as the original
procedure in about one-tenth the amount of computer time.

6. Different methods of estimating ability from adaptive tests have different
characteristics. Validities in the prediction of college grade-point
averages from a stradaptive test were higher for ability estimates not based on
IRT methods than they were for IRT-based ability estimates (Research Report
80-3). Within the IRT methods for estimating ability, Bayesian methods are
slightly order dependent, resulting in slightly different ability estimates
with the same items administered in different orders (Sympson, in Research
Report 77-1). Bayesian ability estimates also have different psychometric
characteristics than do estimates based on maximum likelihood procedures.

7. Adaptive tests can be used for classification purposes as well as for
measurement on a continuous scale. When compared to conventional tests
designed to make classifications, adaptive tests can classify more accurately
than conventional tests when it is necessary to make more than a single
dichotomous classification based on test scores (Vale, in Research Report
77-1).
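
The dependence of Bayesian ability estimates on the prior (finding 3) and the
contrast with maximum likelihood estimates (finding 6) can be illustrated with
a small sketch. It assumes a three-parameter logistic response model, a normal
prior, and a posterior evaluated on a discrete ability grid. This is a
deliberately simple stand-in: it is not Owen's Bayesian sequential procedure
studied in Research Report 76-1, and the function names, grid range, and item
parameters are hypothetical.

```python
import math


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def likelihood(theta, items, responses):
    """Likelihood of a 0/1 response pattern at ability theta."""
    value = 1.0
    for (a, b, c), u in zip(items, responses):
        p = p_correct(theta, a, b, c)
        value *= p if u else (1.0 - p)
    return value


def grid_estimates(items, responses, prior_mean=0.0, prior_sd=1.0):
    """Return (Bayesian posterior mean, grid maximum likelihood estimate)."""
    grid = [-4.0 + 0.05 * k for k in range(161)]   # assumed range -4 to +4
    like = [likelihood(t, items, responses) for t in grid]
    prior = [math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2) for t in grid]
    post = [l * p for l, p in zip(like, prior)]
    bayes_mean = sum(t * w for t, w in zip(grid, post)) / sum(post)
    ml = grid[max(range(len(grid)), key=lambda k: like[k])]
    return bayes_mean, ml


# Hypothetical (a, b, c) parameters; the examinee answers all three correctly.
items = [(1.0, -1.0, 0.2), (1.0, 0.0, 0.2), (1.0, 1.0, 0.2)]
all_correct = [1, 1, 1]
bayes_mean, ml = grid_estimates(items, all_correct)
shifted_mean, _ = grid_estimates(items, all_correct, prior_mean=1.0)
print(bayes_mean, ml, shifted_mean)
```

With an all-correct pattern the likelihood keeps rising with ability, so the
grid maximum likelihood estimate sits at the upper edge of the grid, while the
Bayesian estimate stays interior and moves when the prior mean is changed.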

Test Administration Conditions 

8. An analysis of response latency data showed that testees approach different
testing procedures in different ways (Research Report 76-2). The response
latency data suggest that these different test-taking styles and strategies
might be potentially useful as moderator or predictor variables in the
prediction of external criteria.

9. Computer-administered feedback (immediate knowledge of results) on a
conventional test appears to result in enhanced ability test performance for
testees of all ability levels (Research Report 76-3). Under
computer-administered feedback conditions, mean test scores were significantly
higher for both high- and low-ability testees. Ninety percent of college
students favorably evaluated their experience with computer-administered
feedback (Research Report 76-4).

10. Adaptive tests appear to be more intrinsically motivating for low-ability
testees (Research Report 76-4), and result in higher ability estimates
(Research Report 76-3), than similarly administered conventional tests. This
suggests that adaptive testing might eliminate some of the undesirable
psychological effects characteristic of conventional testing procedures,
resulting in fairer and more accurate test scores for testees who typically
obtain low scores on conventional ability tests.

11. Item-difficulty perceptions of college students were highly related to
objective indices of test item difficulty (Research Report 77-3). This
suggests that test difficulty, which may differ between conventional and
adaptive tests for examinees of the same ability, might be an important factor
affecting the test performance of individuals.

12. Test difficulty interacted with immediate knowledge of results to produce
effects on ability estimates, but not on psychological reactions to the
testing conditions (Research Report 78-2). Since difficulty is more equal
across ability levels in an adaptive test than in a conventional test,
these results suggest that the testing environment of adaptive tests will
result in fewer sources of error in ability estimates than will
conventional ability tests.

Other Findings 

13. Analysis of person-fit data derived from the person response curve
indicated that the vast majority of college students studied responded to a set
of test items in accordance with the 3-parameter logistic IRT model (Research
Report 79-7). The person response curve approach also identified a small
group of individuals whose responses to the test items appeared to result
from an underlying multidimensional ability structure with respect to the
ability domain studied.

14. The dependence of adaptive testing on the multiple-choice item will result
in test scores with less than optimal properties. Analysis of free-response
item data indicates that more informative ability estimates can be
derived from free-response items than from the same items administered as
multiple-choice items and scored by optimal IRT methods; differences were
greater for high-ability examinees (Research Report 77-2).

15. Interactive computer administration of ability test items permits the
design and implementation of ability tests using novel item formats, which
may extend the range of measurable abilities beyond those now measurable
using a dimensional approach. The design and implementation of an
interactive spatial problem-solving test (Research Report 80-2) permitted the
measurement and analysis of a number of problem-solving types of variables
that described individual differences in problem-solving styles; these
variables might be useful as ability kinds of variables, following further
study and refinement.
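
Finding 13 refers to the three-parameter logistic IRT model, and several
findings are stated in terms of how much information a test provides about
ability. A minimal sketch of the standard three-parameter logistic response
function and its item information function is given below. The omission of the
1.7 scaling constant, the function names, and the two hypothetical item pools
(one peaked at a single difficulty, one spread across difficulties) are
assumptions made for illustration and do not come from the Research Reports.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response: a = discrimination,
    b = difficulty, c = lower asymptote (guessing)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the 3PL model (logistic metric, no 1.7 factor)."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def total_information(theta, items):
    """Test information is the sum of the item informations."""
    return sum(item_information(theta, a, b, c) for a, b, c in items)


# Hypothetical 10-item tests: all items at b = 0 versus b spread over a range.
peaked = [(1.5, 0.0, 0.2)] * 10
spread = [(1.5, -2.25 + 0.5 * k, 0.2) for k in range(10)]

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(total_information(theta, peaked), 3),
          round(total_information(theta, spread), 3))
```

The peaked test concentrates its information near the middle of the ability
range, while the spread test gives up some information there in exchange for
more even coverage; an adaptive test aims to obtain the peaked test's
information at whatever ability level the examinee actually has.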

Implications for Further Research 

The findings and experience of this research program support the
feasibility, utility and psychometric advantages of computerized adaptive
testing of intellectual capabilities. However, many new questions were raised
by the research, and some of the original questions addressed are still in need
of further research.



Adaptive Testing Strategies

Research has concentrated on comparison of the stradaptive and Bayesian
adaptive testing strategies with conventional tests. Further research is needed
(1) comparing these strategies directly with each other, in both live testing
and in simulation, and (2) in comparing these strategies with other adaptive
testing strategies, such as an information-based item selection routine.

All adaptive testing strategy comparisons to date that used monte carlo
simulation techniques have made two assumptions that are not characteristic of
real data. First, they have assumed that the item pool is characterized by
items with known parameter values. In real item pools, however, item parameter
values are never known, but are always estimated. These estimates are only
approximations to the true values and, as a consequence, contain some degree of
error, with rather substantial degrees of error for some of the item parameters.
Since adaptive testing strategies are designed to explicitly select items based
on these item parameter estimates, the possibility exists that in a real item
pool with error-laden item parameters adaptive tests might perform less
optimally due to the error in the item parameter estimates. Thus, simulation
studies should be designed and implemented to experimentally vary the degrees
of error in item parameter estimates and to evaluate the effects of these
errors on the performance of adaptive testing strategies.
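
A simulation of the kind proposed above might be organized as in the following
sketch. Items are selected and scored with error-laden parameter estimates,
while the simulated responses are generated from the true parameters. Several
parts are assumptions made only for illustration: the normal error model for
the discrimination and difficulty parameters, the information-based item
selection rule (used here in place of the stradaptive and Bayesian strategies
studied under this contract), the grid-based posterior mean as the ability
estimate, and all function names and numeric settings.

```python
import math
import random

GRID = [-4.0 + 0.1 * k for k in range(81)]   # ability grid, assumed range


def p_3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b, c):
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def perturb(pool, error_sd, rng):
    """Hypothetical error model: normal error on a and b, c left unchanged."""
    return [(max(0.2, a + rng.gauss(0.0, error_sd)),
             b + rng.gauss(0.0, error_sd), c) for a, b, c in pool]


def posterior_mean(items, responses):
    """Ability estimate on GRID under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)
        for (a, b, c), u in zip(items, responses):
            p = p_3pl(t, a, b, c)
            w *= p if u else 1.0 - p
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def adaptive_test(true_pool, est_pool, theta, n_items, rng):
    """Select and score with estimated parameters; respond with true ones."""
    used, responses, theta_hat = [], [], 0.0
    for _ in range(n_items):
        candidates = [k for k in range(len(est_pool)) if k not in used]
        best = max(candidates,
                   key=lambda k: information(theta_hat, *est_pool[k]))
        used.append(best)
        responses.append(rng.random() < p_3pl(theta, *true_pool[best]))
        theta_hat = posterior_mean([est_pool[k] for k in used], responses)
    return theta_hat


def rmse(error_sd, n_examinees=100, n_items=10, seed=1):
    """Root mean squared error of ability estimates for one error level."""
    rng = random.Random(seed)
    true_pool = [(rng.uniform(0.8, 2.0), rng.uniform(-2.5, 2.5), 0.2)
                 for _ in range(60)]
    est_pool = perturb(true_pool, error_sd, rng)
    total = 0.0
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        estimate = adaptive_test(true_pool, est_pool, theta, n_items, rng)
        total += (estimate - theta) ** 2
    return math.sqrt(total / n_examinees)


for sd in (0.0, 0.3, 0.6):
    print(sd, round(rmse(sd), 3))
```

Varying `error_sd` is the experimental manipulation described in the text; a
fuller study would also replicate over many item pools and compare several
testing strategies under each error level.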

A second assumption made in all monte carlo comparisons of adaptive testing
strategies is that the item pool is strictly unidimensional, since only one set
of item parameter values is used for each item. In real data, however, item
pools are very rarely strictly unidimensional. Frequently, item pools are
characterized by second and succeeding factors that account for from trivial
portions of the item pool variance to substantial portions of that variance.
While multidimensional IRT models have not yet been sufficiently
operationalized to permit the estimation of item parameters for dimensions
beyond the first, it is possible to examine the effects of multidimensionality
on adaptive testing strategies. One approach to studying this problem is to
simulate the administration of adaptive testing strategies with unidimensional
item parameters when item responses are generated from an underlying
multidimensional structure. This approach assumes that the dimensionality of
the item responses is the true underlying multidimensional structure, while the
apparent unidimensionality of the item pool is the result of the item
parameterization process applied to it. Studies of this type would enable the
identification of the degrees and types of multidimensionality that could be
tolerated by the various adaptive testing strategies without serious
degradation of their performance.
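
The response-generation step of such a study could look like the sketch
below. It assumes a compensatory two-dimensional logistic model in which a
fixed share of each item's discrimination loads on a second ability; the
source does not specify a multidimensional model, so this form, the
`second_weight` parameter, and the function names are hypothetical. The
generated responses would then be scored by an adaptive strategy that uses
only the unidimensional (a, b, c) parameters.

```python
import math
import random


def p_one_dim(theta, a, b, c):
    """Unidimensional three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_two_dim(theta1, theta2, a1, a2, b, c):
    """Compensatory two-dimensional logistic model (an assumed form)."""
    z = a1 * theta1 + a2 * theta2 - (a1 + a2) * b
    return c + (1.0 - c) / (1.0 + math.exp(-z))


def generate_responses(theta1, theta2, items, second_weight, rng):
    """Generate 0/1 responses for one examinee.

    second_weight is the share of each item's discrimination that loads on
    the second dimension; 0.0 reproduces the unidimensional model.
    """
    responses = []
    for a, b, c in items:
        a1, a2 = a * (1.0 - second_weight), a * second_weight
        p = p_two_dim(theta1, theta2, a1, a2, b, c)
        responses.append(int(rng.random() < p))
    return responses


# Hypothetical pool and an examinee whose two abilities differ.
rng = random.Random(7)
items = [(1.2, -1.0 + 0.25 * k, 0.2) for k in range(9)]
print(generate_responses(1.0, -1.0, items, 0.0, rng))
print(generate_responses(1.0, -1.0, items, 0.4, rng))
```

Increasing `second_weight` in steps is one way to vary the degree of
multidimensionality while keeping the nominal item parameters fixed.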

Further live-testing comparisons of adaptive testing strategies are also
necessary. The four live-testing studies completed under this contract yielded
somewhat conflicting results. In two of the four studies, adaptive tests
obtained higher validities than conventional tests with a smaller average
number of items, and in one study with a smaller median number of items. In the
study using military recruits a very clear advantage was obvious for the
adaptive tests beginning at short test lengths. When a large group of college
students was studied, however, although the expected differences in reliability
were obtained, the conventional test performed better on the concurrent
validity criterion. Since the design of the two large-sample studies was
similar, differences in results could be attributable to differences in the
examinees, the item pools, or the criterion tests. Additional live-testing
studies are needed to evaluate the effects of these conditions, as well as to
evaluate the performance of other adaptive testing strategies and to evaluate
their performance with additional criterion variables.

Test Administration Conditions 

The research results show that a number of test administration variables
influence test scores, IRT-based ability estimates, and/or examinees' reactions
to tests. These include test speededness, test difficulty, and immediate
feedback to examinees as to whether their item responses are correct or
incorrect. Testing strategy (adaptive versus conventional) also had some
effects on test performance and reactions, probably due to the differing
difficulties of adaptive and conventional tests. Immediate feedback of results
appeared to be an important potential factor in increasing test-taking
motivation and improving test scores.

Studies completed on the effects of test administration conditions have all
utilized volunteer college students as examinees and have used verbal ability
items in the tests administered. Since the test-taking motivation of volunteer
students might differ when tested under conditions where the tests are being
used for grading or other purposes, future studies should examine the effects of
test administration conditions when the tests being administered are to be used
for purposes other than research. In addition, the generality of the observed
effects should be studied on populations other than college students, and using
other tests in addition to verbal ability tests. Further studies should also
include the effects of other adaptive testing strategies as test administration
conditions, in conjunction with immediate knowledge of results.

Intra-Indlvidual Dimensionality, Response Modes, and New Abilities 

Research in these three areas was only begun during the contract period.
The person characteristic curve results show that the vast majority of the one
group of college students studied responded to a set of test items in accordance
with the three-parameter logistic IRT model. A small group of students was
identified, however, whose responses appeared to be reliably divergent from that
model. These deviations were ascribed to intra-individual multidimensionality.
Since the person response curve method was used in only this one study, further
studies are indicated. Of importance is the performance in monte carlo
simulations of the person-fit indices under conditions of unidimensionality, the
derivation of appropriate sampling distributions of the person-fit indices, the
evaluation of alternate person-fit indices, and the effect of test structure
characteristics (e.g., distributions of item characteristics) on the performance
of person-fit indices. Additional live-testing studies should also be
implemented to study the effects of various test administration conditions
(e.g., interruptions, poor testing conditions, immediate knowledge of results)
on intra-individual dimensionality by means of the person response curve and
associated indices of person fit.
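
The comparison that underlies a person response curve can be sketched as
follows: for one examinee, items are grouped into difficulty strata, and the
observed proportion correct in each stratum is set beside the proportion
expected under the three-parameter logistic model at that examinee's ability
estimate. The equal-sized strata, the function names, and the example data are
assumptions for illustration; this is not the procedure or the fit index of
Research Report 79-7.

```python
import math


def p_3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def person_response_curve(items, responses, theta_hat, n_strata=3):
    """Return (mean difficulty, observed, expected) for each stratum.

    items     -- list of (a, b, c) parameters, in administration order
    responses -- 0/1 responses aligned with items
    theta_hat -- the examinee's ability estimate
    Assumes at least n_strata items.
    """
    order = sorted(range(len(items)), key=lambda k: items[k][1])  # easiest first
    n = len(order)
    curve = []
    for s in range(n_strata):
        members = order[s * n // n_strata:(s + 1) * n // n_strata]
        mean_b = sum(items[k][1] for k in members) / len(members)
        observed = sum(responses[k] for k in members) / len(members)
        expected = sum(p_3pl(theta_hat, *items[k]) for k in members) / len(members)
        curve.append((mean_b, observed, expected))
    return curve


# Hypothetical six-item record for one examinee with ability estimate 0.0.
items = [(1.0, 2.0, 0.2), (1.0, -2.0, 0.2), (1.0, 0.0, 0.2),
         (1.0, -1.0, 0.2), (1.0, 1.0, 0.2), (1.0, 0.5, 0.2)]
responses = [0, 1, 1, 1, 0, 0]
curve = person_response_curve(items, responses, theta_hat=0.0)
for mean_b, observed, expected in curve:
    print(mean_b, observed, round(expected, 3))
```

Large or erratic gaps between the observed and expected columns would be the
kind of pattern a person-fit index is meant to summarize; the sampling
behavior of any such index is exactly what the simulations proposed above
would need to establish.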

Failure to extract sufficient information from an examinee's responses to
multiple-choice test items can lower the quality of obtained measurements. The
one study completed on this problem indicated that the use of free-response
items was able to improve the measurement precision of a set of vocabulary items
beyond that possible from scoring the same items as polychotomous
multiple-choice items. Both of these administration/scoring modes provide
better measurement than dichotomously-scored multiple-choice items. Since this
study used college students on a single short vocabulary test, further studies
are obviously needed to examine the generality of the results. In addition,
research is needed to examine the performance of other alternatives to the
dichotomously-scored multiple-choice item such as probabilistic responding,
which are now feasible when administered by interactive computers.

Interactive computer administration of ability tests makes possible the
development of a wide range of new kinds of ability tests to supplement the
standard dimensionality-based tests currently in use. This project has
demonstrated that interactive administration of a problem-solving type of test
can result in substantial amounts of new kinds of data on examinees in addition
to the traditional number of items answered correctly. These data can include
information on problem-solving styles and response latencies that might be
indicative of other individual differences problem-solving variables. Future
research should investigate the psychometric characteristics of these variables,
including their reliabilities and their contributions to validity, as well as
examine the utility of the interactive computer for measuring other abilities
such as spatial, perceptual, and memory abilities which can now be measured by
computer administration.




RESEARCH REPORT ABSTRACTS 

Research Report 75-6 
A Simulation Study of Stradaptive Ability Testing
C. David Vale and David J. Weiss
December 1975

A conventional test and two forms of a stradaptive test were administered to
thousands of simulated subjects by minicomputer. Characteristics of the three
tests using several scoring techniques were investigated while varying the
discriminating power of the items, the lengths of the tests, and the
availability of prior information about the testee's ability level. The tests
were evaluated in terms of their correlations with underlying ability, the
amount of information they provided about ability, and the equiprecision of
measurement they exhibited. Major findings were (1) scores on the conventional
test correlated progressively less with ability as item discriminating power
was increased beyond a = 1.0; (2) the conventional test provided increasingly
poorer equiprecision of measurement as items became more discriminating;
(3) these undesirable characteristics were not characteristic of scores on the
stradaptive test; (4) the stradaptive test provided higher score-ability
correlations than the conventional test when item discriminations were high;
(5) the stradaptive test provided more information and better equiprecision of
measurement than the conventional test when test lengths and item
discriminations were the same for the two strategies; (6) the use of valid
prior ability estimates by stradaptive strategies resulted in scores which had
better measurement characteristics than scores derived from a fixed entry
point; (7) a Bayesian scoring technique implemented within the stradaptive
testing strategy provided scores with good measurement characteristics; and
(8) further research is necessary to develop improved flexible termination
criteria for the stradaptive test. (AD A020961)

Research Report 76-1 
Some Properties of a Bayesian Adaptive Ability Testing Strategy
James R. McBride and David J. Weiss
March 1976 

Four Monte Carlo simulation studies of Owen's Bayesian sequential procedure for
adaptive mental testing were conducted. Whereas previous simulation studies of
this procedure have concentrated on evaluating it in terms of the correlation of
its test scores with simulated ability in a normal population, these four stud-
ies explored a number of additional properties, both in a normally distributed
population and in a distribution-free context. Study 1 replicated previous
studies with finite item pools, but examined such properties as the bias of es-
timate, mean absolute error, and correlation of test length with ability. Stud-
ies 2 and 3 examined the same variables in a number of hypothetical infinite
item pools, investigating the effects of item discriminating power, guessing,
and variable vs. fixed test length. Study 4 investigated some properties of the
Bayesian test scores as latent trait estimators, under three different item pool
configurations (regressions of item discrimination on item difficulty). The
properties of interest included the regression of latent trait estimates on ac-
tual trait levels, the conditional bias of such estimates, the information curve
of the trait estimates, and the relationship of test length to ability level.
The results of these studies indicated that the ability estimates derived from
the Bayesian test strategy were highly correlated with ability level. However,
the ability estimates were also highly correlated with number of items adminis-
tered, were non-linearly biased, and provided measurements which were not of
equal precision at all levels of ability. (AD A022964)



Research Report 76-2
Effects of Time Limits on Test-Taking Behavior
T. W. Miller and David J. Weiss 
April 1976 

Three related experimental studies analyzed rate and accuracy of test response
under time-limit and no-time-limit conditions. Test instructions and multiple-
choice vocabulary items were administered by computer. Student volunteers re-
ceived monetary rewards under both testing conditions. In the first study, col-
lege students were blocked into high- and low-ability groups on the basis of
pretest scores. Results for both ability groups showed higher response rates
under time-limit conditions than under no-time-limit conditions. There were no
significant differences between the time-limit and no-time-limit accuracy
scores. Similar results were obtained in a second study in which each student
received both time-limit and no-time-limit conditions. In a third study each
testee received the same testing condition twice, and higher response rates were
observed under the time-limit condition; response accuracy remained consistent
across testing conditions. All three studies showed essentially zero correla-
tions between response rate and response accuracy. Response latency data were
also analyzed in the three studies. These data suggested the existence of dif-
ferent test-taking styles and strategies under time-limit and no-time-limit
testing conditions. The results of these studies suggest that number-correct
scores from time-limit tests are a complex function of response rate, response
accuracy, test-taking style, and test-taking strategy, and therefore are not
likely to be as valid or as useful as number-correct scores from no-time-limit
tests. (AD A024422)



Research Report 76-3 
Effects of Immediate Knowledge of Results 
and Adaptive Testing on Ability Test Performance 
Nancy E. Betz and David J. Weiss
June 1976 

This study investigated the effects of immediate knowledge of results (KR) con-
cerning the correctness or incorrectness of each item response on a computer-ad-
ministered test of verbal ability. The effects of KR were examined on a 50-item
conventional test and a stradaptive ability test and in high- and low-ability
groups. The primary dependent variable was the maximum likelihood ability esti-
mate derived from the item responses. Results indicated that mean test scores
for the High-Ability group receiving KR were higher than for the No-KR group on
both the conventional and stradaptive tests. For Low-Ability examinees, mean
scores were higher under KR conditions than under No-KR conditions on both
tests, but the difference was statistically significant only for the conven-
tional test. However, the higher mean scores of the Low-Ability testees on the
stradaptive test indicated that for low-ability examinees, adaptive testing had
the same effects on test performance as did the provision of immediate KR.
Knowledge of results did not have significant effects on response latencies, re-
sponse consistency on the stradaptive test, or the internal consistency relia-
bility of the conventional test. No significant score differences were found on
a 44-item post-test administered without KR, indicating that the facilitative
effects of knowledge of results on test performance were confined to the test in
which KR was provided. The results of the study were interpreted as indicating
the potential of both immediate knowledge of results and adaptive testing pro-
cedures to increase the extent to which ability tests measure "maximum perfor-
mance" levels. (AD A027147)



Research Report 76-4 
Psychological Effects of Immediate Knowledge of 
Results and Adaptive Ability Testing 
Nancy E. Betz and David J. Weiss
June 1976 

This study investigated the effects of providing immediate knowledge of results
(KR) and adaptive testing on test anxiety and test-taking motivation. Also
studied was the accuracy of student perceptions of the difficulty of adaptive
and conventional tests administered with or without immediate knowledge of re-
sults. Testees were 350 college students divided into high- and low-ability
groups and randomly assigned to one of four test strategies by KR conditions.
The ability level of examinees was found to be related to their reported levels
of motivation and to differences in reported motivation under the different
testing conditions. Low-ability examinees reported significantly higher levels
of motivation on the stradaptive test than on the conventional test, while the
reported motivation of high-ability examinees did not differ as a function of
ability level. Low-ability testees reported lower motivation with KR than with-
out KR, while higher ability testees reported higher motivation with KR. Analy-
sis of the anxiety data indicated that students reported significantly higher
levels of anxiety on the stradaptive test than on the conventional test. The
provision of KR did not result in significant differences in reported anxiety.
However, highest levels of anxiety were reported by the low-ability group on the
stradaptive test administered with KR. These results, in conjunction with pre-
viously reported data on effects of KR on ability test performance, were inter-
preted as being the result of facilitative anxiety. Students were able to per-
ceive the relative difficulty of test items with some accuracy. However, per-
ceptions of the relative degree of test difficulty were much more closely relat-
ed to actual test score on the conventional test than on the stradaptive test.
Over 90% of the students reacted favorably to the provision of immediate KR.
These results suggest that adaptive testing creates a psychological environment
for testing which is more equivalently motivating for examinees of all ability
levels and results in a greater standardization of the test-taking environment
than does conventional testing. (AD A027170)






Research Report 77-1 
Applications of Computerized Adaptive Testing 
James R. McBride, James B. Sympson,
C. David Vale, Steven M. Pine, and Isaac I. Bejar
Edited by David J. Weiss
March 1977



This symposium consisted of five papers: 

1. James R. McBride: A Brief Overview of Adaptive Testing

Adaptive testing is defined, and some of its item selection and scoring
strategies are briefly discussed. Item response theory, or item characteristic
curve theory, which is useful for the implementation of adaptive testing, is
briefly described. The concept of "information" in a test is introduced
and discussed in the context of both adaptive and conventional tests. The
advantages of adaptive testing, in terms of the nature of information it
provides, are described.

2. James B. Sympson: Estimation of Latent Trait Status in Adaptive Testing 
Procedures 

The role of latent trait theory in measurement for criterion prediction and
in criterion-referenced measurement is explicated. It is noted that latent
trait models allow both norm-referenced and criterion-referenced inter-
pretations of test performance. Using a 3-parameter logistic test model,
an example of sequential estimation in a 20-item adaptive test is present-
ed. After each item is administered, four different ability estimates (two
likelihood-based and two Bayesian estimates) are calculated. Characteris-
tics of the four estimation methods are discussed. The information avail-
able in the items selected by the adaptive test is compared with the infor-
mation available from application of latent trait theory, and adaptive
testing is advocated as a useful approach to human assessment.

3. C. David Vale: Adaptive Testing and the Problem of Classification 

The use of adaptive testing procedures to make ability classification deci-
sions (i.e., cutting score decisions) is discussed. Data from computer
simulations comparing conventional testing strategies with an adaptive
testing strategy are presented. These data suggest that, although a con-
ventional test is as good as an adaptive test when there is one cutting
score at the middle of the distribution of ability, an adaptive test can
provide better classification decisions when there is more than one cutting
score. Some utility considerations are also discussed.



4. Steven M. Pine: Applications of Item Characteristic Curve Theory to the
Problem of Test Bias 

It is argued that a major problem in current efforts to develop less biased
tests is an over-reliance on classical test theory. Item characteristic
curve (ICC) theory, which is based on individual rather than group-oriented
measurement, is offered as a more appropriate measurement model. A defini-
tion of test bias based on ICC theory is presented. Using this definition,
several empirical tests for bias are presented and demonstrated with real 
test data. Additional applications of ICC theory to the problem of test 
bias are also discussed. 






5. Isaac I. Bejar: Applications of Adaptive Testing in Measuring Achievement
and Performance 

The paper reviews two relatively recent developments in psychometric 
theory — the assessment of partial knowledge and research in adaptive test- 
ing. It is argued that the use of non-dichotomous item formats, needed for 
the assessment of partial knowledge, and now made possible by the adminis- 
tration of achievement test items on interactive computers, should result 
in achievement test scores which are a more realistic and precise indica- 
tion of what a student can do. 
(AD A038114) 



Research Report 77-2 
A Comparison of Information Functions of Multiple-Choice 
and Free-Response Vocabulary Items 
C. David Vale and David J. Weiss 
April 1977 

Twenty multiple-choice vocabulary items and 20 free-response vocabulary items 
were administered to 660 college students. The free-response items consisted of 
the stem words of the multiple-choice items. Testees were asked to respond to 
the free-response items with synonyms. A computer algorithm was developed to 
transform the numerous free-responses entered by the testees into a manageable 
number of categories. The multiple-choice and the free-response items were then 
calibrated according to Bock's polychotomous logistic model. One item was dis-
carded because of extremely poor fit with the model, and test information func-
tions were determined from the other 19 items. Higher levels of information
were obtained from the free-response items over most of the range of abilities
from θ = -3.0 to θ = +3.0.



Research Report 77-3 
Accuracy of Perceived Test-Item Difficulties 
J. Stephen Prestwood and David J. Weiss 
May 1977 

This study investigated the accuracy with which testees perceive the difficulty 
of ability-test items. Two 41-item conventional tests of verbal ability were 
constructed for administration to testees in two ability groups. Testees in 
both the high- and low-ability groups responded to each multiple-choice item by 
choosing the correct alternative and then rating the item's difficulty relative 
to their levels of ability. Least-squares estimates of item difficulty, which 
were based on the difficulty ratings, correlated highly with proportion-correct 
and latent trait estimates of item difficulty based on a norming sample. Least-
squares estimates of testee ability, which were based solely on the difficulty
perceptions of the testees, correlated significantly with number-correct and 
maximum-likelihood ability scores based on the testees' conventional responses 
to the items. These results show that item-difficulty perceptions were highly
related to the "objective" indices of item difficulty often used in test con-
struction and that as testee ability level increased, the items were perceived
as being relatively less difficult. The relationship between a testee's ability
and his/her perception of an individual item's relative difficulty appeared to
be weak. Of major importance was the finding that items which were appropriate
in difficulty levels from a psychometric standpoint were perceived by the tes-
tees as being too difficult for their ability levels. The effects on testees of
tailoring a test such that items are perceived as being uniformly too difficult
should be investigated. (AD A041084)



Research Report 77-4 
A Rapid Item-Search Procedure for Bayesian Adaptive Testing 
C. David Vale and David J. Weiss 
May 1977 

An alternative item-selection procedure for use with Owen's Bayesian adaptive
testing strategy is proposed. This procedure is, by design, faster than Owen's
original procedure because it searches only part (as compared with all) of the
total item pool. Item selections are, however, identical for both methods.
After a conceptual development of the rapid-search procedure, the supporting
mathematics are presented. In a simulated comparison with three item pools, the
rapid-search procedure required as little as one-tenth of the computer time
required by Owen's technique. (AD A041090)



Research Report 78-2
The Effects of Knowledge of Results and Test Difficulty 
on Ability Test Performance and Psychological Reactions to Testing 
J. Stephen Prestwood and David J. Weiss
September 1978

Students were administered one of three conventional or one of three stradaptive
vocabulary tests with or without knowledge of results (KR). The three tests of
each type differed in difficulty, as assessed by the expected proportion of cor-
rect responses to the test items. Results indicated that the mean maximum-like-
lihood estimates of individuals' abilities varied as a joint function of KR pro-
vision and test difficulty. Students receiving KR scored highest on the most-
difficult test and lowest on the least-difficult test; students receiving no KR
scored highest on the least-difficult test and did most poorly on the most-
difficult test. Although the students perceived the differences in test diffi-
culty, there were no effects on mean student anxiety or motivation scores at-
tributable to difficulty alone. Regardless of test difficulty, students reacted
very favorably to receiving KR, and its provision increased the mean level of
reported motivation.



Research Report 79-7 
The Person Response Curve: Fit of Individuals
to Item Characteristic Curve Models
Tom E. Trabin and David J. Weiss
December 1979 

This study investigated a method of determining the fit of individuals to item
characteristic curve (ICC) models using the person response curve (PRC). The
construction of observed PRCs is based on an individual's proportion correct on
test item subsets (strata) that differ systematically in difficulty level. A
method is proposed for identifying irregularities in an observed PRC by compar-
ing it with the expected PRC predicted by the three-parameter logistic ICC model
for that individual's ability level. Diagnostic potential of the PRC is dis-
cussed in terms of the degree and type of deviations of the observed PRC from
the expected PRC predicted by the model.

Observed PRCs were constructed for 151 college students using vocabulary test
data on 216 items of wide difficulty range. Data on students' test-taking moti-
vation, test-taking anxiety, and perceived test difficulty were also obtained.
PRCs for the students were found to be reliable and to have shapes that were
primarily a function of ability level. Three-parameter logistic model expected
PRCs served as good predictors of observed PRCs for over 90% of the group. As
anticipated from this general overall fit of the observed data to the ICC model,
there were no significant correlations between degree of non-fit and test-taking
motivation, test-taking anxiety, or perceived test difficulty. Using split-pool
observed PRCs, a few students were identified who deviated significantly from
the expected PRC.

The results of this study suggested that three-parameter logistic expected PRCs
for given ability levels were good predictors of test response profiles for the
students in this sample. Significant non-fit between observed and expected PRCs
would suggest the interaction of additional dimensions in the testing situation
for a given individual. Recommendations are made for further research on person
response curves.
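
The comparison of observed and expected PRCs described in this abstract can be
sketched in a few lines of Python. This is a minimal illustration under stated
assumptions, not the authors' procedure: the function names, the example item
parameters, and the scaling constant D = 1.7 are all hypothetical, and the
abstract does not say how strata were formed or how deviations were tested.

```python
import math
from collections import defaultdict

D = 1.7  # logistic scaling constant; assumed here, not stated in the abstract


def p_correct(theta, a, b, c):
    """Three-parameter logistic ICC: probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def observed_prc(responses, strata):
    """Proportion correct within each difficulty stratum (responses are 0/1)."""
    n_items = defaultdict(int)
    n_correct = defaultdict(int)
    for u, s in zip(responses, strata):
        n_items[s] += 1
        n_correct[s] += u
    return {s: n_correct[s] / n_items[s] for s in sorted(n_items)}


def expected_prc(theta, items, strata):
    """Mean model-predicted probability correct within each stratum."""
    n_items = defaultdict(int)
    p_sum = defaultdict(float)
    for (a, b, c), s in zip(items, strata):
        n_items[s] += 1
        p_sum[s] += p_correct(theta, a, b, c)
    return {s: p_sum[s] / n_items[s] for s in sorted(n_items)}


# Hypothetical data: two easy items (stratum 0) and two hard items (stratum 1),
# each given as (a, b, c) = (discrimination, difficulty, guessing).
items = [(1.0, -1.0, 0.2), (1.0, -1.0, 0.2), (1.0, 1.0, 0.2), (1.0, 1.0, 0.2)]
strata = [0, 0, 1, 1]
responses = [1, 1, 0, 1]

obs = observed_prc(responses, strata)
expd = expected_prc(0.0, items, strata)  # ability estimate of 0.0 is assumed
deviations = {s: obs[s] - expd[s] for s in obs}
```

Large deviations in particular strata would be the kind of irregularity the
abstract refers to; how large is "significant" is left open here.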



Research Report 80-2
Interactive Computer Administration of a Spatial Reasoning Test
Austin T. Church and David J. Weiss
April 1980

This report describes a pilot study on the development and administration of a
test using a spatial reasoning problem, the 15-puzzle. The test utilized the
on-line capabilities of a real-time computer (1) to record an examinee's prog-
ress on each problem through a sequence of problem-solving "moves" and (2) to
collect additional on-line data that might be of relevance to the evaluation of
examinee performance (e.g., number of illegal and repeated moves, response la-
tency trends). The examinees, 61 students in an introductory psychology class,
were required to type a sequence of moves that would bring one 4x4 array of
scrambled numbers (start configuration) into agreement with a second 4x4 array
(goal configuration), using as few moves as possible. Data analyses emphasized
the comparison of several methods of indexing problem difficulty, methods of
scoring individual performance, and the relationship among response latency
data, performance, and problem-solving strategy.

Subjective ratings of the perceived difficulty of replications of the 15-puzzle
were obtained from a separate student sample to investigate (1) the subjective
dimensions used by students in evaluating the difficulty of this problem type,
(2) how accurately the actual performance difficulty of these problems could be
evaluated by students, and (3) whether there were reliable individual differ-
ences in difficulty perceptions related to actual performance differences.



Results of the study suggested that four performance indices might be useful in
indexing problem difficulty: (1) mean number of moves in the sample, (2) pro-
portion of students solving the problem, (3) proportion of students solving the
problem in the optimal number of moves, and (4) a Special Difficulty Index, de-
fined as the sample mean number of moves divided by the minimum number of moves
required. Four alternative methods of scoring total test performance and two
methods of scoring individual problem performance were studied. The scores that
took into account differential numbers of moves between the optimal and maximum
number allowed were related somewhat more to performance ratings obtained from
independent judges.
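
The four difficulty indices named in this paragraph are simple to compute from
per-student records. The sketch below assumes each record is a (moves, solved)
pair and that the mean number of moves is taken over all students in the sample;
the abstract does not say how unsolved attempts were counted, so that choice,
the function name, and the toy data are illustrative assumptions only.

```python
def difficulty_indices(attempts, min_moves):
    """Compute the four problem-difficulty indices for one problem.

    attempts  -- list of (moves, solved) pairs, one per student (assumed layout)
    min_moves -- minimum number of moves required to solve the problem
    """
    n = len(attempts)
    mean_moves = sum(moves for moves, _ in attempts) / n
    return {
        "mean_moves": mean_moves,
        "prop_solved": sum(1 for _, solved in attempts if solved) / n,
        "prop_optimal": sum(
            1 for moves, solved in attempts if solved and moves == min_moves
        ) / n,
        # Special Difficulty Index: sample mean moves / minimum moves required
        "special_difficulty_index": mean_moves / min_moves,
    }


# Hypothetical records for four students on a problem solvable in 10 moves.
attempts = [(10, True), (14, True), (20, False), (10, True)]
indices = difficulty_indices(attempts, min_moves=10)
```

Under this definition a Special Difficulty Index near 1.0 means students
typically solved the problem in close to the minimum number of moves.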

Examination of problem performance indices, the Special Difficulty Index, and
students' perceptions of the difficulty of the test problems indicated that most
of the problems were too easy for most students. However, the possibility of
obtaining a more discriminating subset of problems was suggested by item-total
score correlations obtained for each problem. The data suggested that better
consistency might be obtained using problems of similar difficulty levels, and
it was hypothesized that an adaptive test tailoring problems to the ability lev-
el of each student would increase the reliability of measurement.

Mean initial and total "move" latencies for each problem were strongly related
to some of the performance indices of problem difficulty. At the level of indi-
vidual performance, only total latency or problem solution time was related to
problem performance. Latency data appeared to confound differences in the abil-
ity to visualize a sequence of moves and differences in students' work styles.
Strong evidence for these work styles was found in student consistency of ini-
tial, average, and total response latency measures across all problems.

Perceived difficulty ratings showed reliable individual differences in the level
and variability of difficulty perceptions. The data suggested that the individ-
ual differences found were related to individual differences in ability to visu-
alize and to maintain a sequence of moves in short-term memory. It was conclud-
ed that an adequate selection of problem replications should be able to tap
these differences, resulting in reliable solution performance differences.

Improvements in problem selection and design were suggested by the data in this
study. Future tests of this type should consist of fewer but more difficult
problems, particularly problems not permitting reactive, impulsive solutions.
This type of test would seem especially appropriate for adaptive administra-
tion: (1) scores on problems tailored to the individual's ability would likely
be more highly related to each other, resulting in more highly reliable total
scores; (2) the motivational aspects of the tests, which seem more taxing and
potentially frustrating than conventional item formats, would likely be im-
proved; and (3) for most testees equally precise measurements could be obtained
in shorter periods of time than with conventional test administration.






Research Report 80-3 
Criterion-Related Validity of Adaptive Testing Strategies
Janet G. Thompson and David J. Weiss
June 1980

Criterion-related validity of two adaptive tests was compared with that of a
conventional test in two groups of college students. Students in Group 1
(N = 101) were administered a stradaptive test and a peaked conventional test;
students in Group 2 (N = 131) were administered a Bayesian adaptive test and the
same peaked conventional test. All tests were computer-administered multiple-
choice vocabulary tests; items were selected from the same pool, but there was
no overlap of items between the adaptive and conventional tests within each
group. The stradaptive test item responses were scored using four different
methods (two mean difficulty scores, a Bayesian score, and maximum likelihood)
with two different sets of item parameter estimates, to study the effects on
criterion-related validity of scoring methods and/or item parameter estimates.
Criterion variables were high school and college grade-point averages (GPA), and
scores on the American College Testing Program (ACT) achievement tests.

Results indicated generally higher validities for the adaptive tests; at least
one method of scoring the stradaptive tests resulted in higher correlations than
the conventional test with seven of the eight criterion variables (and equal
correlations for the eighth), even though the stradaptive test administered over
25% fewer items, on the average, than did the conventional test. The stradap-
tive test obtained a significantly higher correlation with overall college GPA
(r = .27) than did the conventional test; when math GPA was partialled from
overall GPA, the maximum correlation for the stradaptive test with an average
length of 29.2 items was r = .51, while the 40-item conventional test correlated
only .36. The data showed generally higher criterion-related validities for the
mean difficulty scores on the stradaptive test in comparison to the Bayesian and
maximum likelihood scores; the different item parameter estimates had no effect
on validity, resulting in scores that correlated .98 with each other.

Although the mean length of the Bayesian adaptive test was 48.7 items, the medi-
an number of items (35) was less than that of the 40-item conventional test.
Ability estimates from this adaptive test also correlated higher with seven of
the eight criterion variables than did scores on the conventional tests, al-
though none of the differences were statistically significant.

These data indicate that adaptive tests can achieve criterion-related validities
equal to, and in some cases significantly greater than, those obtained by con-
ventional tests while administering up to 27% fewer items, on the average. The
data also suggest that latent-trait-based scoring of stradaptive tests may not
be optimal with respect to criterion-related validity. Limitations of the study
are discussed and suggestions are made for additional research. (AD A087595)






Research Report 80-5 
An Alternate-Forms Reliability and Concurrent Validity
Comparison of Bayesian Adaptive and Conventional Ability Tests
G. Gage Kingsbury and David J. Weiss
December 1980

Two 30-item alternate forms of a conventional test and a Bayesian adaptive test
were administered by computer to 472 undergraduate psychology students. In ad-
dition, each student completed a 120-item paper-and-pencil test, which served as
a concurrent validity criterion test, and a series of very easy questions de-
signed to detect students who were not answering conscientiously. All test
items were five-alternative multiple-choice vocabulary items. Reliability and
concurrent validity of the two testing strategies were evaluated after the ad-
ministration of each item for each of the tests, so that trends indicating dif-
ferences in the testing strategies as a function of test length could be detect-
ed. For each test, additional analyses were conducted to determine whether the
two forms of the test were operationally alternate forms.

Results of the analysis of alternate-forms correspondence indicated that for all
test lengths greater than 10 items, each of the alternate forms for the two test
types resulted in fairly constant mean ability level estimates. When the scor-
ing procedure was equated, the mean ability levels estimated from the two forms
of the conventional test differed to a greater extent than those estimated from
the two forms of the Bayesian adaptive test.

The alternate-forms reliability analysis indicated that the two forms of the
Bayesian test resulted in more reliable scores than the two forms of the conven-
tional test for all test lengths greater than two items. This result was ob-
served when the conventional test was scored either by the Bayesian or propor-
tion-correct method.
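
Evaluating alternate-forms reliability after each item, as described here,
amounts to correlating examinees' scores on the two forms at every test length.
A minimal Python sketch follows, assuming scores are stored per examinee as a
list indexed by test length; the data layout, function names, and toy scores are
hypothetical and are not taken from the report.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def reliability_by_length(form_a, form_b):
    """Correlate form A and form B scores at each test length.

    form_a[i][k] is examinee i's score on form A after k + 1 items
    (an assumed layout); form_b is indexed the same way.
    """
    n_lengths = len(form_a[0])
    return [
        pearson_r([row[k] for row in form_a], [row[k] for row in form_b])
        for k in range(n_lengths)
    ]


# Toy scores for three examinees at test lengths 1 and 2 (hypothetical).
form_a = [[0, 1], [1, 2], [1, 3]]
form_b = [[1, 1], [0, 2], [1, 3]]
r_by_length = reliability_by_length(form_a, form_b)
```

Plotting such a list against test length for each testing strategy would show
the kind of trend comparison the abstract describes.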

The concurrent validity analysis showed that the conventional test produced
ability level estimates that correlated more highly with the criterion test
scores than did the Bayesian test for all lengths greater than four items. This
result was observed for both scoring procedures used with the conventional test.

Limitations of the study, and the conclusions that may be drawn from it, are
discussed. These limitations, which may have affected the results of this
study, included possible differences in the alternate forms used within the two
testing strategies, the relatively small calibration samples used to estimate
the ICC parameters for the items used in the study, and method variance in the
conventional tests. (AD A094477)



Research Report 81-2 
Effects of Immediate Feedback and Pacing of Item Presentation
on Ability Test Performance and Psychological Reactions to Testing
Marilyn F. Johnson, David J. Weiss, and J. Stephen Prestwood
February 1981

The study investigated the joint effects of knowledge of results (KR or no-KR),
pacing of item presentation (computer or self-pacing), and type of testing
strategy (50-item peaked conventional, variable-length stradaptive, or 50-item
fixed-length stradaptive test) on ability test performance, test item response
latency, information, and psychological reactions to testing. The psychological
reactions to testing were obtained from Likert-type items that assessed test-
taking anxiety, motivation, perception of difficulty, and reactions to knowledge
of results. Data were obtained from 447 college students randomly assigned to
one of the 12 experimental conditions.

The results indicated that there were no effects on ability estimates due to
knowledge of results, testing strategy, or pacing of item presentation. Al-
though average latencies were greater on the stradaptive tests than on the con-
ventional test, the overall testing time was not substantially longer on the
adaptive tests and may have been a function of differences in test difficulty.
Analysis of information values indicated higher levels of information on the
stradaptive tests than on the conventional test. There was no statistically
significant main effect for any of the three experimental conditions when test
anxiety or test-taking motivation was the dependent variable, although there
were some significant interaction effects.

These results indicate that testing conditions may interact in a complex way to
determine psychological reactions to the testing environment. The interactions
do suggest, however, a somewhat consistent standardizing effect of KR on test
anxiety and test-taking motivation. This standardizing effect of KR showed that
approximately equal levels of motivation and anxiety were reported under the 
various testing conditions when KR was provided, but that mean levels of these 
variables were substantially different when KR was not provided. Consistent 
with theoretical expectations, the conventional test was perceived as being 
either too easy or too difficult, whereas the adaptive tests were perceived more 
often as being of appropriate difficulty. 

The results concerning the effects of KR on test performance, motivation, and
anxiety found in this study were contrary to earlier reported findings, and
differences in the studies are delineated. Recommendations are made concerning
the control of specific testing conditions, such as difficulty of the test and
ability level of the examinee population, along with suggestions for the further
analysis of the standardizing effect of KR.

Research Report 83-1 
Reliability and Validity of Adaptive and Conventional Tests 
in a Military Recruit Population
John T. Martin, James R. McBride, and David J. Weiss

January 1983 

A conventional verbal ability test and a Bayesian adaptive verbal ability test
were compared using a variety of psychometric criteria. Tests were administered
to 550 Marine recruits, half of whom received two 30-item alternate forms of a
conventional test and half of whom received two 30-item alternate forms of a
Bayesian adaptive test. Both types of tests were computer administered and were
followed by a 50-item conventional verbal ability criterion test.

The alternate forms of the adaptive test resulted in scores that were much more
similar in means and variances than were the conventional tests, for which most
means and variances for various test lengths were significantly different.
Adaptive testing resulted in significantly higher alternate forms reliability
correlations for all test lengths through 19 items; reliability of a 9-item
adaptive test was equal to that of a 17-item conventional test. Validity
correlations were higher for the adaptive procedure for all test lengths.
Validity of an 11-item adaptive test was equal to that of a 27-item conventional
test, in spite of lower discriminating items being used, on the average, by the
adaptive tests in comparison to the conventional test. Very few of the recruits
had difficulty in responding to the computer-administered instructions on use of
the testing terminals. Analysis showed some differences in test duration between
the two testing strategies; where they occurred, they were explained by the
ability level of the examinees, i.e., higher ability examinees who were
administered adaptive tests received more difficult items and therefore had
significantly longer testing times. Combined with reduced test length for the
adaptive test to obtain similar reliabilities and validities to the conventional
test, however, the slight increases observed in adaptive testing time were
negligible.

The data support the feasibility of adaptive testing with military recruit
populations and support theoretical predictions of the psychometric superiority
of adaptive tests in comparison with number-correct scored conventional tests.
(AD A129324)
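
As an illustration of the alternate-forms reliability statistic reported above,
the following is a minimal sketch and not the authors' analysis code. It assumes
that reliability is estimated as the Pearson correlation between examinees'
scores on the two alternate forms. The function name pearson_r and the score
values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (n >= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both forms")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five examinees on two alternate forms
# (illustrative values only, not data from the report).
form_a = [12.0, 15.0, 9.0, 20.0, 17.0]
form_b = [11.0, 16.0, 10.0, 19.0, 18.0]
reliability = pearson_r(form_a, form_b)
```

Computing this coefficient separately for scores based on the first k items of
each form would give one reliability value per test length, which is the kind of
comparison by test length that the abstract describes.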






DISTRIBUTION LIST



1 Dr. Arthur Bachrach
Environmental Stress Program Center
Naval Medical Research Institute
Bethesda, MD 20014

1 Dr. Meryl S. Baker
San Diego, CA 92152

1 Liaison Scientist
Office of Naval Research
Branch Office, London
Box 39
FPO New York, NY 09510

WML 

Orlando, FL 32813

1 Dr. Robert Carroll
NAVOP 115
Washington, DC 20370

1 Chief of Naval Education and Training
Liaison Office
Operations Training Division
Williams AFB, AZ 85224

1 Dr. Stanley Collyer
Office of Naval Technology
800 N. Quincy Street
Arlington, VA 22217

1 CDR Mike Curran
Office of Naval Research
800 N. Quincy St.
Code 270
Arlington, VA 22217

1 Dr. Doug Davis
CNET
Pensacola, FL

1 Dr. Tom Duffy
Navy Personnel R&D Center
San Diego, CA 92152

1 Mike Durmeyer
Instructional Program Development
Building 90
NET-PDCD
Great Lakes NTC, IL

1 Dr. Richard Elster
Department of Administrative Sciences
Naval Postgraduate School
Monterey, CA 93940

1 DR. PAT FEDERICO
Code P13
NPRDC
San Diego, CA 92152

1 Dr. Cathy Fernandes
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Jim Hollan
Code 14
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Ed Hutchins
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Norman J. Kerr
Chief of Naval Technical Training
Naval Air Station Memphis (75)
Millington, TN 38054

1 Dr. Leonard Kroeker
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. William L. Maloy (02)
Chief of Naval Education and Training
Naval Air Station
Pensacola, FL 32508

1 Dr. James McBride
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. William Montague
NPRDC Code 13
San Diego, CA 92152

1 Bill Nordbrock
1032 Fairlawn Ave.
Libertyville, IL 60048

1 Library, Code P201L
Navy Personnel R&D Center
San Diego, CA 92152

1 Technical Director
Navy Personnel R&D Center
San Diego, CA 92152

6 Personnel & Training Research Group
Code 442PT
Office of Naval Research
Arlington, VA 22217

1 Special Asst. for Education and
Training (OP-01E)
Rm. 2705 Arlington Annex
Washington, DC 20370

1 LT Frank C. Petho, MSC, USN (Ph.D)
CNET (N-432)
NAS
Pensacola, FL 32508

1 Dr. Bernard Rimland (01C)
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Carl Ross
CNET-PDCD
Building 90
Great Lakes NTC, IL 60088

1 Dr. Robert G. Smith
Office of Chief of Naval Operations
OP-987H
Washington, DC 20350



1 Dr. Alfred F. Smode, Director
Training Analysis & Evaluation Group
Dept. of the Navy
Orlando, FL 32813

1 Dr. Richard Sorensen
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Frederick Steinheiser
CNO - OP115
Navy Annex
Arlington, VA 20370

1 Mr. Brad Sympson
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Frank Vicino
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Edward Wegman
Office of Naval Research (Code 411S&P)
800 North Quincy Street
Arlington, VA 22217

1 Dr. Ronald Weitzman
Naval Postgraduate School
Department of Administrative Sciences
Monterey, CA 93940

1 Dr. Douglas Wetzel
Code 12
Navy Personnel R&D Center
San Diego, CA 92152

1 DR. MARTIN F. WISKOFF
NAVY PERSONNEL R&D CENTER
SAN DIEGO, CA 92152

1 Mr John H. Wolfe
Navy Personnel R&D Center
San Diego, CA 92152

1 Dr. Wallace Wulfeck, III
Navy Personnel R&D Center
San Diego, CA 92152

Marine Corps

1 H. William Greenup
Education Advisor (E031)
Education Center, MCDEC
Quantico, VA 22134

1 Director, Office of Manpower Utilization
HQ, Marine Corps (MPU)
BCB, Bldg. 2009
Quantico, VA 22134

1 Headquarters, U. S. Marine Corps
Code MPI-20
Washington, DC 20380

1 Special Assistant for Marine
Corps Matters
Code 100M
Office of Naval Research
800 N. Quincy St.
Arlington, VA 22217

1 DR. A.L. SLAFKOSKY
SCIENTIFIC ADVISOR (CODE RD-1)
HQ, U.S. MARINE CORPS
WASHINGTON, DC 20380

Headquarters, Marine Corps
Washington, DC 20380

U. S. Army Research Institute for the
Behavioral and Social Sciences
Alexandria, VA 22333

Alexandria, VA 22333

1 Dr. Kent Eaton
5001 Eisenhower Blvd.
Alexandria, VA 22333

1 Dr. Beatrice J. Farr
U. S. Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Myron Fischl
U.S. Army Research Institute for the
Social and Behavioral Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Milton S. Katz
U.S. Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Harold F. O'Neil, Jr.
Director, Training Research Lab
Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Commander, U.S. Army Research Institute
for the Behavioral & Social Sciences
ATTN: PERI-BR (Dr. Judith Orasanu)
Alexandria, VA 20333

1 Joseph Psotka, Ph.D.
ATTN: PERI-1C
Army Research Institute
5001 Eisenhower Ave.
Alexandria, VA 22333

1 Mr. Robert Ross
U.S. Army Research Institute for the
Social and Behavioral Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Robert Sasmor
U. S. Army Research Institute for the
Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Joyce Shields
Army Research Institute for the
Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

1 Dr. Hilda Wing
Army Research Institute
5001 Eisenhower Ave.
Alexandria, VA 22333

1 Dr. Robert Wisher
Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

Air Force

1 Air Force Human Resources Lab
AFHRL/MPD
Brooks AFB, TX 78235

1 Technical Documents Center
Air Force Human Resources Laboratory
WPAFB, OH 45433

1 U.S. Air Force Office of Scientific
Research
Life Sciences Directorate, NL
Bolling Air Force Base
Washington, DC 20332

1 Air University Library
AUL/LSE 76/443
Maxwell AFB, AL 36112

1 Dr. Earl A. Alluisi
HQ, AFHRL (AFSC)
Brooks AFB, TX 78235

1 Mr. Raymond E. Christal
Brooks AFB, TX 78235

1 Dr. Alfred R. Fregly
AFOSR/NL
Bolling AFB, DC 20332

1 Dr. Genevieve Haddad
Program Manager
Life Sciences Directorate
AFOSR
Bolling AFB, DC 20332

1 Dr. T. M. Longridge
AFHRL/OTE
Williams AFB, AZ 85224

1 Mr. Randolph Park
AFHRL/MOAN
Brooks AFB, TX 78235

1 Dr. Roger Pennell
Air Force Human Resources Laboratory
Lowry AFB, CO 80230

1 Dr. Malcolm Ree
AFHRL/MP
Brooks AFB, TX 78235

1 3700 TCHTW/TTGH
2Lt Tallarigo
Sheppard AFB, TX 76311

1 Lt. Col James E. Watson
HQ USAF/MPXOA
The Pentagon
Washington, DC 20330

1 Major John Welsh
AFMPC
Randolph AFB, TX

1 Dr. Joseph Yasatuke
AFHRL/LRT
Lowry AFB, CO 80230



Department of Defense

12 Defense Technical Information Center
Cameron Station, Bldg 5
Alexandria, VA 22314
Attn: TC

1 Dr. Craig I. Fields
Advanced Research Projects Agency
1400 Wilson Blvd.
Arlington, VA 22209

1 Jerry Lehnus
HQ MEPCOM
Attn: MEPCT-P
Fort Sheridan, IL 60037

1 Military Assistant for Training and
Personnel Technology
Office of the Under Secretary of Defense
for Research & Engineering
Room 3D129, The Pentagon
Washington, DC 20301

1 Dr. Wayne Sellman
Office of the Assistant Secretary
of Defense (MRA & L)
2B269 The Pentagon
Washington, DC 20301

1 Major Jack Thorpe
DARPA
1400 Wilson Blvd.
Arlington, VA 22209

Civilian Agencies

1 Dr. Susan Chipman
Learning and Development
National Institute of Education
1200 19th Street NW
Washington, DC 20208

1 Dr. Vern W. Urry
Personnel R&D Center
Office of Personnel Management
1900 E Street NW
Washington, DC 20415

1 Mr. Thomas A. Warm
U. S. Coast Guard Institute
P. O. Substation 18
Oklahoma City, OK 73169

1 Dr. Joseph L. Young, Director
Memory & Cognitive Processes
National Science Foundation
Washington, DC 20550

Private Sector 

1 Dr. James Algina
University of Florida
Gainesville, FL 326

1 Dr. Erling B. Andersen
Department of Statistics
Studiestraede 6
1455 Copenhagen

1 Psychological Research Unit
NBH-3-44 Attn: Librarian
Northbourne House
Turner ACT 2601
AUSTRALIA

1 Dr. Isaac Bejar
Educational Testing Service
Princeton, NJ 08450






1 Dr. Menucha Birenbaum
School of Education
Tel Aviv University
Tel Aviv, Ramat Aviv 69978

1 Dr. R. Darrell Bock
Department of Education
University of Chicago
Chicago, IL 60637

1 Dr. Robert Brennan
American College Testing Programs
P. O. Box
Iowa City, IA 52243

1 Dr. Ernest R. Cadotte
307 Stokely
University of Tennessee
Knoxville, TN 37916

1 Dr. John B. Carroll
409 Elliott Rd.
Chapel Hill, NC 27514

1 Dr. Norman Cliff
Dept. of Psychology
Univ. of So. California
University Park
Los Angeles, CA 90007

1 Dr. Hans Crombag
Education Research Center
University of Leyden
Boerhaavelaan 2
2334 EN Leyden
The NETHERLANDS

1 Dr. Dattprasad Divgi
Syracuse University
Department of Psychology
Syracuse, NY 13210

1 Dr. Fritz Drasgow
Department of Psychology
University of Illinois
603 E. Daniel St.
Champaign, IL 61820

1 Dr. Susan Embertson
UNIVERSITY OF KANSAS
Lawrence, KS 66045

1 ERIC Facility-Acquisitions
4833 Rugby Avenue
Bethesda, MD 20014

1 Dr. Benjamin A. Fairbank
McFann-Gray & Associates, Inc.
5825 Callaghan
Suite 225
San Antonio, TX 78228

1 Dr. Leonard Feldt
Lindquist Center for Measurement
University of Iowa
Iowa City, IA 52242

1 Dr. Richard L. Ferguson
The American College Testing Program
P.O. Box 168
Iowa City, IA 52240

1 Univ. Prof. Dr. Gerhard Fischer
Liebiggasse 5/3
A 1010 Vienna
AUSTRIA

1 Professor Donald Fitzgerald
University of New England
Armidale, New South Wales 2351



1 Dr. Dexter Fletcher
WICAT Research Institute
1875 S. State St.
Orem, UT 22333

1 Dr. Janice Gifford
University of Massachusetts
School of Education
Amherst, MA 01002

1 Dr. Robert Glaser
Learning Research & Development Center
University of Pittsburgh
3939 O'Hara Street
PITTSBURGH, PA 15260

1 Dr. Bert Green
Johns Hopkins University
Department of Psychology
Charles & 34th Street
Baltimore, MD 21218

1 Dr. Ron Hambleton
School of Education
University of Massachusetts
Amherst, MA 01002

1 Dr. Delwyn Harnisch
University of Illinois
242b Education
Urbana, IL 61801

1 Dr. Paul Horst
677 G Street, #184
Chula Vista, CA 90010

1 Dr. Lloyd Humphreys
Department of Psychology
University of Illinois
603 East Daniel Street
Champaign, IL 61820

1 Dr. Jack Hunter
2122 Coolidge St.
Lansing, MI 48906

1 Dr. Huynh Huynh
College of Education
University of South Carolina
Columbia, SC 29208

1 Dr. Douglas H. Jones
Advanced Statistical Technologies
Corporation
10 Trafalgar Court
Lawrenceville, NJ 08148

1 Professor John A. Keats
Department of Psychology
The University of Newcastle
N.S.W. 2308
AUSTRALIA

1 Dr. William Koch
University of Texas-Austin
Measurement and Evaluation Center
Austin, TX 78703

1 Dr. Alan Lesgold
Learning R&D Center
University of Pittsburgh
3939 O'Hara Street
Pittsburgh, PA 15260

1 Dr. Michael Levine
Department of Educational Psychology
210 Education Bldg.
University of Illinois
Champaign, IL 61801




1 Dr. Charles Lewis
Faculteit Sociale Wetenschappen
Rijksuniversiteit Groningen
Oude Boteringestraat 23
9712GC Groningen
Netherlands

1 Dr. Robert Linn
College of Education
University of Illinois
Urbana, IL 61801

1 Mr. Phillip Livingston
Systems and Applied Sciences Corporation
6811 Kenilworth Avenue
Riverdale, MD 20840

1 Dr. Robert Lockman
Center for Naval Analysis
200 North Beauregard St.
Alexandria, VA 22311

1 Dr. Frederic M. Lord
Educational Testing Service
Princeton, NJ 08541

1 Dr. James Lumsden
Department of Psychology
University of Western Australia
Nedlands W.A. 6009
AUSTRALIA

1 Dr. Gary Marco
Stop 31-E
Educational Testing Service
Princeton, NJ 08451

1 Dr. Scott Maxwell
Department of Psychology
University of Notre Dame
Notre Dame, IN 46556

1 Dr. Samuel T. Mayo
Loyola University of Chicago
820 North Michigan Avenue
Chicago, IL

1 Mr. Robert McKinley
American College Testing Programs
P.O. Box 168
Iowa City, IA 52243

1 Dr. Barbara Means
Human Resources Research Organization
300 North Washington
Alexandria, VA 22314

1 Dr. Robert Mislevy
711 Illinois Street
Geneva, IL 60134

1 Dr. Allen Munro
Behavioral Technology Laboratories
1845 Elena Ave., Fourth Floor
Redondo Beach, CA 90277

1 Dr. W. Alan Nicewander
University of Oklahoma
Department of Psychology
Oklahoma City, OK 73069

1 Dr. Melvin R. Novick
356 Lindquist Center for Measurement
University of Iowa
Iowa City, IA 52242

1 Dr. James Olson
WICAT, Inc.
1875 South State Street
Orem, UT 84057



American Council on Education
GED Testing Service, Suite 20
One Dupont Circle, NW

1 Dr. James A. Paulson
Portland State University
P.O. Box 751
Portland, OR 97207

1 Dr. Mark D. Reckase
ACT
P. O. Box 168
Iowa City, IA 52243

1 Dr. Thomas Reynolds
University of Texas-Dallas
Marketing Department
P. O. Box 688
Richardson, TX 75080

1 Dr. Lawrence Rudner
403 Elm Avenue
Takoma Park, MD 20012

1 Dr. J. Ryan
Department of Education
University of South Carolina
Columbia, SC 29208

1 PROF. FUMIKO SAMEJIMA
DEPT. OF PSYCHOLOGY
KNOXVILLE, TN 37916

1 Frank L. Schmidt
Department of Psychology
Bldg. GG
Washington, DC 20052

1 Dr. Walter Schneider
Psychology Department
603 E. Daniel
Champaign, IL 61820

1 Lowell Schoer
Psychological & Quantitative
Foundations
College of Education
University of Iowa
Iowa City, IA 52242

1 DR. ROBERT J. SEIDEL
INSTRUCTIONAL TECHNOLOGY GROUP
HUMRRO
300 N. WASHINGTON ST.
ALEXANDRIA, VA 22314



1 Dr. Robert Sternberg
Dept. of Psychology
Yale University
Box 11A, Yale Station
New Haven, CT 06520

1 Dr. Peter Stoloff
Center for Naval Analysis
200 North Beauregard Street
Alexandria, VA 22311

1 Dr. William Stout
University of Illinois
Department of Mathematics
Urbana, IL 61801

1 Dr. Hariharan Swaminathan
Laboratory of Psychometric and
Evaluation Research
School of Education
University of Massachusetts
Amherst, MA 01003

1 Dr. Kikumi Tatsuoka
Computer Based Education Research Lab
252 Engineering Research Laboratory
Urbana, IL 61801

1 Dr. Maurice Tatsuoka
220 Education Bldg
1310 S. Sixth St.
Champaign, IL 61820

1 Dr. David Thissen
Department of Psychology
University of Kansas
Lawrence, KS 66044

1 Dr. Robert Tsutakawa
Department of Statistics
University of Missouri
Columbia, MO 65201

1 Dr. J. Uhlaner
Uhlaner Consultants
4258 Bonavita Drive
Encino, CA 91436

1 Dr. V. R. R. Uppuluri
Union Carbide Corporation
Nuclear Division
P. O. Box Y
Oak Ridge, TN 37830

1 Dr. David Vale
Assessment Systems Corporation
2233 University Avenue
Suite 310
St. Paul, MN 55114



1 Wolfgang Wildgrube
Streitkraefteamt
Box 20 50 03
D-5300 Bonn 2
WEST GERMANY

1 Dr. Bruce Williams
Department of Educational Psychology
University of Illinois
Urbana, IL 61801

1 Dr. Wendy Yen
CTB/McGraw Hill
Del Monte Research Park
Monterey, CA 93940



1 Dr. Kazuo Shigemasu
University of Tohoku
Department of Educational Psychology
Kawauchi, Sendai 980
JAPAN

1 Dr. Edwin Shirkey
Department of Psychology
University of Central Florida
Orlando, FL 32816

1 Dr. William Sims
Center for Naval Analysis
200 North Beauregard Street
Alexandria, VA 22311

1 Dr. H. Wallace Sinaiko
Program Director
Manpower Research and Advisory Services
Smithsonian Institution
801 North Pitt Street
Alexandria, VA 22314




1 Dr. Howard Wainer
Division of Psychological Studies
Educational Testing Service
Princeton, NJ 08540

1 Dr. Michael T. Waller
Department of Educational Psychology
University of Wisconsin--Milwaukee
Milwaukee, WI 53201

1 Dr. Brian Waters
300 North Washington
Alexandria, VA 22314

1 Dr. Rand R. Wilcox
University of Southern California
Department of Psychology
Los Angeles, CA 90007






Previous Publications 



Proceedings of the 1979 Computerized Adaptive Testing Conference.
      September 1980.
Proceedings of the 1977 Computerized Adaptive Testing Conference.
      July 1978.

Research Reports

83-3. Effect of Examinee Certainty on Probabilistic Test Scores and a Comparison
      of Scoring Methods for Probabilistic Responses. July 1983.
83-2. Bias and Information of Bayesian Adaptive Testing. March 1983.
83-1. Reliability and Validity of Adaptive and Conventional Tests in a Military
      Recruit Population. January 1983.
81-5. Dimensionality of Measured Achievement Over Time. December 1981.
81-4. Factors Influencing the Psychometric Characteristics of an Adaptive
      Testing Strategy for Test Batteries. November 1981.
81-3. A Validity Comparison of Adaptive and Conventional Strategies for Mastery
      Testing. September 1981.
Final Report: Computerized Adaptive Ability Testing. April 1981.
81-2. Effects of Immediate Feedback and Pacing of Item Presentation on Ability
      Test Performance and Psychological Reactions to Testing. February 1981.
81-1. Review of Test Theory and Methods. January 1981.
80-5. An Alternate-Forms Reliability and Concurrent Validity Comparison of
      Bayesian Adaptive and Conventional Ability Tests. December 1980.
80-4. A Comparison of Adaptive, Sequential, and Conventional Testing Strategies
      for Mastery Decisions. November 1980.
80-3. Criterion-Related Validity of Adaptive Testing Strategies. June 1980.
80-2. Interactive Computer Administration of a Spatial Reasoning Test. April
      1980.
Final Report: Computerized Adaptive Performance Evaluation. February 1980.
80-1. Effects of Immediate Knowledge of Results on Achievement Test Performance
      and Test Dimensionality. January 1980.
79-7. The Person Response Curve: Fit of Individuals to Item Characteristic Curve
      Models. December 1979.
79-6. Efficiency of an Adaptive Inter-Subtest Branching Strategy in the
      Measurement of Classroom Achievement. November 1979.
79-5. An Adaptive Testing Strategy for Mastery Decisions. September 1979.
79-4. Effect of Point-in-Time in Instruction on the Measurement of Achievement.
      August 1979.
79-3. Relationships among Achievement Level Estimates from Three Item
      Characteristic Curve Scoring Methods. April 1979.
Final Report: Bias-Free Computerized Testing. March 1979.
79-2. Effects of Computerized Adaptive Testing on Black and White Students.
      March 1979.
79-1. Computer Programs for Scoring Test Data with Item Characteristic Curve
      Models. February 1979.
78-5. An Item Bias Investigation of a Standardized Aptitude Test. December 1978.
78-4. A Construct Validation of Adaptive Achievement Testing. November 1978.
78-3. A Comparison of Levels and Dimensions of Performance in Black and White
      Groups on Tests of Vocabulary, Mathematics, and Spatial Ability.
      October 1978.




78-2. The Effects of Knowledge of Results and Test Difficulty on Ability Test
      Performance and Psychological Reactions to Testing. September 1978.
78-1. A Comparison of the Fairness of Adaptive and Conventional Testing
      Strategies. August 1978.
77-7. An Information Comparison of Conventional and Adaptive Tests in the
      Measurement of Classroom Achievement. October 1977.
77-6. An Adaptive Testing Strategy for Achievement Test Batteries. October 1977.
77-5. Calibration of an Item Pool for the Adaptive Measurement of Achievement.
      September 1977.
77-4. A Rapid Item-Search Procedure for Bayesian Adaptive Testing. May 1977.
77-3. Accuracy of Perceived Test-Item Difficulties. May 1977.
77-2. A Comparison of Information Functions of Multiple-Choice and Free-Response
      Vocabulary Items. April 1977.
77-1. Applications of Computerized Adaptive Testing. March 1977.
Final Report: Computerized Ability Testing, 1972-1975. April 1976.
76-5. Effects of Item Characteristics on Test Fairness. December 1976.
76-4. Psychological Effects of Immediate Knowledge of Results and Adaptive
      Ability Testing. June 1976.
76-3. Effects of Immediate Knowledge of Results and Adaptive Testing on Ability
      Test Performance. June 1976.
76-2. Effects of Time Limits on Test-Taking Behavior. April 1976.
76-1. Some Properties of a Bayesian Adaptive Ability Testing Strategy. March
      1976.
75-6. A Simulation Study of Stradaptive Ability Testing. December 1975.
75-5. Computerized Adaptive Trait Measurement: Problems and Prospects. November
      1975.
75-4. A Study of Computer-Administered Stradaptive Ability Testing. October
      1975.
75-3. Empirical and Simulation Studies of Flexilevel Ability Testing. July 1975.
75-2. TETREST: A FORTRAN IV Program for Calculating Tetrachoric Correlations.
      March 1975.
75-1. An Empirical Comparison of Two-Stage and Pyramidal Adaptive Ability
      Testing. February 1975.
74-5. Strategies of Adaptive Ability Measurement. December 1974.
74-4. Simulation Studies of Two-Stage Ability Testing. October 1974.
74-3. An Empirical Investigation of Computer-Administered Pyramidal Ability
      Testing. July 1974.
74-2. A Word Knowledge Item Pool for Adaptive Ability Measurement. June 1974.
74-1. A Computer Software System for Adaptive Ability Measurement. January 1974.
73-4. An Empirical Study of Computer-Administered Two-Stage Ability Testing.
      October 1973.
73-3. The Stratified Adaptive Computerized Ability Test. September 1973.
73-2. Comparison of Four Empirical Item Scoring Procedures. August 1973.
73-1. Ability Measurement: Conventional or Adaptive? February 1973.

Copies of these reports are available, while supplies last, from: 
Computerized Adaptive Testing Laboratory 
N660 Elliott Hall 
University of Minnesota 
75 East River Road
Minneapolis MN 55455 U.S.A.






